In silico analysis of TRPM4 variants of unknown clinical significance

Background TRPM4 is a calcium-activated channel that selectively permeates monovalent cations. Genetic variants of the channel in cardiomyocytes are associated with various heart disorders, such as progressive familial heart block and Brugada syndrome. About 97% of all known TRPM4 missense variants are classified as variants of unknown clinical significance (VUSs). The very large number of VUSs is a serious problem in the diagnostics and treatment of inherited heart diseases.
Methods and results We collected 233 benign or pathogenic missense variants in the superfamily of TRP channels from the databases ClinVar, Humsavar and Ensembl Variation to compare the performance of 22 algorithms that predict damaging variants. We found that ClinPred is the best-performing tool for TRP channels. We also used the paralogue annotation method to identify disease variants across the TRP family. In the set of 565 VUSs of hTRPM4, ClinPred predicted pathogenicity of 299 variants. Among these, 12 variants are also categorized as likely pathogenic or pathogenic (LP/P) variants in at least one paralogue of hTRPM4. We further used the cryo-EM structure of hTRPM4 to find scores of contact pairs between parental (wild-type) residues of VUSs for which ClinPred predicts a high probability of pathogenicity of variants for both contact partners. We propose that 68 respective missense VUSs are also likely pathogenic variants.
Conclusions ClinPred outperformed other in silico tools in predicting damaging variants of TRP channels. ClinPred, the paralogue annotation method, and analysis of residue contacts in the hTRPM4 cryo-EM structure collectively suggest pathogenicity of 80 TRPM4 VUSs.


Introduction
The transient receptor potential channel TRPM4 (member 4 of the melastatin subfamily) is a calcium-activated ion channel that selectively permeates monovalent cations. The channel is widely distributed in various organs. In the myocardial tissue, TRPM4 is involved in cardiac conduction, pacing, action potential repolarization and other processes. Human TRPM4 (hTRPM4) is among the most important cardiac TRP channels whose pathogenic variants are associated with cardiac arrhythmias [1,2]. Mutations in the hTRPM4 gene result in progressive familial heart block type I (PFHBI), bundle-branch block (BBB), right bundle branch block, isolated cardiac conduction disease (ICCD) and Brugada syndrome [3][4][5][6]. As of November 2023, the ClinVar database [7] lists 688 missense variants of hTRPM4. Among these, 633 variants are of unknown clinical significance (VUS), 31 variants have conflicting interpretation of pathogenicity (CIP), four variants are described as likely pathogenic or pathogenic (LP/P), and 20 variants are characterized as likely benign or benign (LB/B). The American College of Medical Genetics and Genomics and the Association for Molecular Pathology (ACMG/AMP) recommend in silico predictive algorithms for variant interpretation [8]. The large number of VUSs in hTRPM4 motivates employing bioinformatics to predict the impact of missense variants on the channel function and to distinguish those variants that may be clinically relevant.
Numerous variant interpretation tools based on different principles have been developed to predict pathogenicity and tolerance of genetic variants [9]. The success rate of these tools varies from 60 to 80% [8]. Each tool and its underlying algorithm have strengths and weaknesses. The ACMG/AMP guideline recommends using multiple software approaches for variant interpretation without specifying the number or types of algorithms. The choice of bioinformatics tools is critical for correct variant interpretation. Choosing the best-performing tool and pathogenicity threshold for a specific protein family increases the reliability of pathogenicity predictions.
The performance of in silico tools may depend on the disease phenotype [10]. For instance, the tools MetaLR, MetaSVM, and MCap demonstrated the top performance in predicting pathogenicity for variants associated with abnormalities in the cardiovascular system [10]. However, some methods yielded many false-positive and false-negative predictions of pathogenicity in individual protein families [8,9,11,12]. For example, MetaSVM predicted a pathogenic effect for 75% of benign variants of the cardiac sodium channel Nav1.5 [11]. Choosing a tool with a high success rate of correct predictions for a specific protein family and adjusting the pathogenicity threshold improves predictions [10,11].
Earlier we applied bioinformatics tools combined with the paralogue annotation method [13] to reclassify as LP/P variants numerous VUSs of sodium channel Nav1.5 and calcium channel Cav1.2 [11,12].The paralogue annotation method employs a multiple sequence alignment of functionally and structurally related proteins and focuses on residues in sequentially matching positions where a disease mutation is known for at least one family member.Then a VUS in the matching position of the channel under investigation is assumed to be a LP/P variant [11,14].
The TRPM4 channel belongs to the superfamily of transient receptor potential (TRP) channels.Some members of this family have attracted increasing attention in the past decade as promising drug targets for treatment of cardiovascular diseases [15], neurodegenerative disorders [14], inflammation [16], and Type II diabetes [17].Information about pathogenic variants of these paralogues may be useful for interpretation of uncharacterized hTRPM4 VUSs.
Here, we composed a large dataset to test the performance of various predictors in identifying known LP/P and benign variants in the superfamily of TRP channels. We collected LP/P missense variants from three databases listed in section 2.2 and benign missense variants from the gnomAD database. We evaluated the performance of 22 popular bioinformatics prediction tools and found that ClinPred outperformed other tools for the superfamily of TRP channels. ClinPred and the paralogue annotation method consensually predicted that 12 VUSs of hTRPM4 may be damaging variants. We further employed the cryo-EM structure of the hTRPM4 channel [18] to find scores of contact pairs between parental (wild-type) residues of VUSs that are likely pathogenic according to ClinPred results for both contact partners, and proposed that 68 respective VUSs can be damaging variants. We propose that the 80 missense VUSs described in this study may be associated with hTRPM4 dysfunctions.

Sequence data of human channels
The hTRPM4 amino acid sequence was obtained from the UniProt database, entry Q8TD43 [19].For the paralogue annotation method, we have chosen proteins from the following subfamilies of human TRP channels: TRPC (Canonical), TRPV (Vanilloid), TRPM (Melastatin), TRPP (Polycystin), TRPML (Mucolipin), and TRPA (Ankyrin).UniProt IDs for the proteins used in the analysis are shown in Table 1.

Collection of variants
LP/P variants for hTRPM4 and its paralogues (Table 2) were collected from three databases: Humsavar (https://www.uniprot.org/docs/humsavar, updated 1 March 2023), Ensembl Variation [20] (updated 2023-03-01) and ClinVar [7] (updated 2023-04-10). Only likely pathogenic and pathogenic variants (LP/P) were extracted from the databases Ensembl Variation and Humsavar. From the ClinVar database, we selected those variants that are characterized as 'pathogenic' or 'likely pathogenic' and are associated with specific clinical conditions. Moreover, we excluded from the pathogenic dataset those variants that were characterized as 'LB/B' or 'VUS' in ClinVar. VUSs were extracted from the ClinVar database where the field 'Clinical Significance' contains the words 'Uncertain significance'. Benign (neutral) variants along with their minor allele frequencies (AF) were obtained from the population database gnomAD [21]. Variants with AF > 0.00005 that are absent in ClinVar were considered benign [22,23]. The number of collected LP/P, VUS, and benign variants is shown in Table 2. All variants were combined in one broad dataset (S1 Table).
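The benign-variant rule stated above (AF > 0.00005 and absent in ClinVar) can be sketched in a few lines of Python. This is only an illustration: the record layout, the field names `id` and `af`, and the example records are hypothetical and do not reproduce the actual gnomAD or ClinVar formats.

```python
AF_THRESHOLD = 0.00005  # allele-frequency cutoff stated in the text


def select_benign(gnomad_variants, clinvar_ids):
    """Keep gnomAD variants with AF above the cutoff that are absent in ClinVar."""
    return [
        v for v in gnomad_variants
        if v["af"] > AF_THRESHOLD and v["id"] not in clinvar_ids
    ]


# Hypothetical records, for illustration only
gnomad = [
    {"id": "var1", "af": 0.0002},   # frequent enough and not in ClinVar
    {"id": "var2", "af": 0.00001},  # too rare
    {"id": "var3", "af": 0.001},    # frequent, but listed in ClinVar
]
clinvar = {"var3"}

benign = select_benign(gnomad, clinvar)
benign_ids = [v["id"] for v in benign]
```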

Topology of the TRPM4 channel
Domain organization and topology of the hTRPM4 channel were obtained from the cryo-EM structure (PDB ID: 5WP6) [18] in the Protein Data Bank (https://www.rcsb.org/) [24]. Full-length TRPM4 has four identical subunits, which form an inverted crown-like structure. The latter includes the transmembrane domain (TMD) and a large cytosolic domain formed by the N-terminal melastatin homology regions (MHR) and the C-terminal domain (CTD) (Fig 1A and 1B). The N-terminal part of each subunit has four melastatin homology regions (MHR1-4) and Pre-S1 helices. The TMD in each subunit contains six transmembrane segments (S1-S6). Segments S5 and S6, which are linked by a large extracellular membrane-reentering P-loop, contribute a quarter to the pore module. In each subunit, segments S1-S4 form a voltage-sensing module (VSM). The CTD contains TRP and CTD helices (Fig 1A and 1B). The MHR domain and the helical CTD constitute a unique intracellular architecture that distinguishes TRPM4 from other TRP channels, where the N-terminal cytosolic domains mainly contain ankyrin repeats [18].

Multiple sequence alignment and paralogue annotation
The paralogue annotation method identifies LP/P missense variants by transferring annotations across families of related proteins [13]. Earlier, we used a modified method of paralogue annotation to predict LP/P variants for the cardiac sodium channel hNav1.5 [12] and calcium channel hCav1.2 [11]. This approach is applied here to the hTRPM4 channel to select VUSs that are likely damaging variants.
For each paralogue channel, LP/P variants were collected as described in section 2.1. Amino acid sequences of hTRPM4 and paralogue channels were aligned using the multiple sequence alignment program T-Coffee [25]. Proteins for which no LP/P variants were found were excluded from the alignment. Each paralogue variant was mapped onto the hTRPM4 sequence (S2 Table) according to the alignment.
Disease-causing (LP/P) variants are more likely to occur at evolutionarily conserved positions. Therefore, we calculated the position-specific conservation score (Cs), which varies from 0 (no conservation) to 1 (identical residues), with values of about 0.8 indicating high conservation. Cs reflects the conservation of physico-chemical properties (small, polar, hydrophobic, tiny, charged, negative, positive, aromatic, aliphatic, proline) in the sequence alignment [26]. Cs values were calculated using the Zvelebil method [27] as implemented in the Amino Acid Conservation Calculation Service [28]. Variants in positions with conservation scores >0.3 were considered as LP/P variants according to [12,13].
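The transfer of a paralogue annotation to hTRPM4 amounts to converting a residue number in the paralogue into the residue number in the matching alignment column, and then checking the conservation score at that position. The sketch below illustrates this under stated assumptions: the function names, the toy alignment rows and the Cs values are hypothetical, and the code is not the pipeline used in the study.

```python
CS_THRESHOLD = 0.3  # conservation cutoff stated in the text


def map_position(aln_paralogue, aln_target, paralogue_pos):
    """Map a 1-based paralogue residue number to the target residue number.

    Both strings are rows of the same alignment ('-' marks a gap).
    Returns None if the paralogue residue faces a gap in the target.
    """
    if len(aln_paralogue) != len(aln_target):
        raise ValueError("aligned rows must have equal length")
    p_idx = t_idx = 0
    for p_char, t_char in zip(aln_paralogue, aln_target):
        if p_char != "-":
            p_idx += 1
        if t_char != "-":
            t_idx += 1
        if p_char != "-" and p_idx == paralogue_pos:
            return t_idx if t_char != "-" else None
    return None


def is_transferable(target_pos, cs_by_position):
    """Accept the annotation only at positions with Cs above the cutoff."""
    return target_pos is not None and cs_by_position.get(target_pos, 0.0) > CS_THRESHOLD


# Toy alignment rows and conservation scores, for illustration only
paralogue_row = "MKT-LVA"
target_row = "M--RLIA"
cs = {1: 1.0, 3: 0.8, 4: 0.2, 5: 1.0}

pos_v5 = map_position(paralogue_row, target_row, 5)  # paralogue V5 faces target I4
pos_k2 = map_position(paralogue_row, target_row, 2)  # paralogue K2 faces a gap
```

In this toy case the paralogue residue 4 maps to target position 3, where Cs = 0.8 passes the cutoff, whereas residue 5 maps to position 4, where Cs = 0.2 does not.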
To generate binary predictions (Damaging/Tolerated), we used the thresholds, which were determined as the optimal pathogenicity threshold from the AUC-ROC curve (Table 2).The 'probably damaging' and 'possibly damaging' classes predicted by tool Polyphen were merged into a single 'damaging' class.The Mutation Assessor server subdivides mutants into four categories.Categories high ('H') or medium ('M') were treated as 'Damaging', whereas categories low ('L') or neutral ('N') were treated as 'Tolerated'.
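The conversion of tool outputs into the two classes can be sketched as follows. The category labels follow the text; the exact label strings, the `higher_is_damaging` flag (needed because some tools assign lower scores to damaging variants) and the example values are assumptions made for illustration.

```python
def binarize_score(score, threshold, higher_is_damaging=True):
    """Turn a numeric pathogenicity score into 'Damaging' or 'Tolerated'."""
    if higher_is_damaging:
        return "Damaging" if score > threshold else "Tolerated"
    return "Damaging" if score < threshold else "Tolerated"


# Merging of categorical outputs as described in the text
# (label strings are assumed spellings, not taken from the tools' output files)
POLYPHEN_CLASSES = {
    "probably damaging": "Damaging",
    "possibly damaging": "Damaging",
    "benign": "Tolerated",
}
MUTATION_ASSESSOR_CLASSES = {
    "H": "Damaging",   # high
    "M": "Damaging",   # medium
    "L": "Tolerated",  # low
    "N": "Tolerated",  # neutral
}

# Hypothetical example values
call_high = binarize_score(0.7, 0.6)
call_low_scale = binarize_score(0.02, 0.05, higher_is_damaging=False)
```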
The overall prediction performance of the 22 methods was assessed by calculating sensitivity, specificity, Matthews Correlation Coefficient (MCC), and accuracy (ACC) as follows:
$$\mathrm{Sensitivity} = \frac{TP}{TP + FN}$$
$$\mathrm{Specificity} = \frac{TN}{TN + FP}$$
$$\mathrm{ACC} = \frac{TP + TN}{TP + TN + FP + FN}$$
$$\mathrm{MCC} = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}$$
The following abbreviations are used in these equations. TP (true positive) is the number of disease-causing variants correctly predicted to be pathogenic; FN (false negative) is the number of disease-causing variants incorrectly predicted as tolerated; TN (true negative) is the number of neutral variants correctly predicted as tolerated; FP (false positive) is the number of neutral variants incorrectly predicted as pathogenic; MCC is a correlation coefficient between the observed and predicted binary classification, ranging from -1 (total disagreement between prediction and observation) to 1 (perfect prediction).
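The performance measures named above follow directly from the four confusion-matrix counts. A minimal Python sketch is given here; the function name and the example counts are hypothetical and do not correspond to any tool evaluated in this study.

```python
import math


def performance(tp, tn, fp, fn):
    """Return sensitivity, specificity, accuracy and MCC from confusion counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when a row or column of the confusion matrix is empty;
    # returning 0.0 in that case is a common convention, used here as an assumption
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, accuracy, mcc


# Hypothetical counts, for illustration only
sens, spec, acc, mcc = performance(tp=8, tn=6, fp=4, fn=2)
```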
For the test dataset, we have chosen 103 benign and 130 LP/P variants from our broad dataset (Table 2, S1 Table). We also calculated the area under the ROC (Receiver Operating Characteristic) curve (AUC) using the library pROC in the programming language R. ROC curves were obtained by plotting sensitivity against (1 - specificity) at each threshold for each algorithm. The AUC ranges from 0 to 1, where 0.5 corresponds to random prediction and 1 to perfectly correct prediction. We used AUC as the main measure of performance. The absence of a variant annotation negatively affects the accuracy of prediction. Thus, we have chosen only those algorithms that returned pathogenicity predictions for over 30% of the variants in our dataset (S1 Table).
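For readers who wish to reproduce the idea without R, the AUC equals the probability that a randomly chosen LP/P variant receives a higher score than a randomly chosen benign variant, with ties counted as one half. The sketch below uses this rank-based definition in plain Python; it is not the pROC implementation. The threshold search by Youden's index (J = sensitivity + specificity - 1) is one common way to pick an optimal cutoff and is an assumption here, since the text does not state the criterion. The scores are hypothetical.

```python
def auc(pos_scores, neg_scores):
    """Rank-based AUC: fraction of (LP/P, benign) pairs ranked correctly."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def best_threshold(pos_scores, neg_scores):
    """Cutoff maximizing Youden's J; a score >= cutoff is called damaging."""
    best_t, best_j = None, -1.0
    for t in sorted(set(pos_scores) | set(neg_scores)):
        sens = sum(p >= t for p in pos_scores) / len(pos_scores)
        spec = sum(n < t for n in neg_scores) / len(neg_scores)
        j = sens + spec - 1.0
        if j > best_j:  # on ties the lowest cutoff is kept
            best_t, best_j = t, j
    return best_t, best_j


# Hypothetical scores, for illustration only
lpp_scores = [0.9, 0.8, 0.7, 0.4]
benign_scores = [0.6, 0.3, 0.2, 0.1]

area = auc(lpp_scores, benign_scores)
cutoff, youden_j = best_threshold(lpp_scores, benign_scores)
```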

Distribution of pathogenic variants in topological regions of hTRPM4
Most pathogenic variants are localized in the cytoplasmic part of the channel, mainly in alpha-helices of the MHR regions. Two variants, I1033M and I1040T, were found in segment S6, and one variant (Y790H) in segment S1 (Fig 1A and 1B). Most LP/P variants of hTRPM4 (83%) are associated with the PFHBI syndrome. Other variants are associated with erythrokeratodermia and Brugada syndrome (S1 Table).

Comparison of in silico bioinformatics tools
We compared the performance of 22 variant interpretation tools (Table 3, Fig 2). We first compiled a test set with 130 true positive (TP) observations and 103 true negative (TN) observations obtained from our broad dataset (S1 Table). AUC and ROC curves are shown in Fig 2. For each tool, we found an optimal deleterious threshold for binary predictions based on its ROC curve, which shows the sensitivity and specificity corresponding to different score thresholds (Table 3). ClinPred demonstrated the best performance in predicting pathogenicity for variants in the TRP superfamily (accuracy = 0.83, AUC = 0.90) (Fig 2), followed by VEST4 (accuracy = 0.79, AUC = 0.88) and MCap (accuracy = 0.76, AUC = 0.87). At the score threshold of 0.6, ClinPred correctly classified most benign variants as tolerated (specificity = 0.75) and most LP/P variants as pathogenic (sensitivity = 0.92) (Table 3). REVEL, PrimateAI, MVP, MetaSVM, MetaLR, LIST.S2 and Deogen2 performed with AUC of > 0.80. The AUC of other algorithms ranged from 0.70 to 0.80. The lowest accuracy across all methods was found for GenoCanyon (accuracy = 0.56, AUC = 0.62).

Notes to Table 3: The deleterious threshold is the custom pathogenicity threshold that divides variants into two categories: pathogenic or benign. Depending on the tool, a score above or below the threshold indicates that the variant is more likely damaging. Sensitivity is the fraction of LP/P variants that were predicted as LP/P by the tool, while specificity is the fraction of benign variants that were predicted as benign by the tool. Accuracy indicates the overall predictive accuracy of the tool. https://doi.org/10.1371/journal.pone.0295974.t003

PLOS ONE
Predicting likely pathogenic/pathogenic variants of the TRPM4 channel

Paralogue annotation of variants identified in hTRPM4
We mapped the LP/P variants of paralogues onto the hTRPM4 regions based on the multiple sequence alignment (section 2.4, S2 Table). A total of 63 known LP/P variants in paralogues are mapped to 36 amino acid positions in the hTRPM4 channel (S2 Table). In these positions, we found 51 variants of hTRPM4, including 50 VUSs and one benign variant. In some cases, more than one variant was mapped to one sequence position. Twenty LP/P variants in paralogue TRP channels were mapped to the MHR1/2 region of hTRPM4 (S2 Table). The MHR1/2 region constitutes ~30% of the entire sequence and is highly variable among paralogues. In contrast, the highly conserved regions among paralogues are transmembrane segments S1-S6 and the TRP helix. These regions have more than 60% of conserved positions (Cs>0.3).

Consensus prediction of LP/P variants in hTRPM4 by ClinPred and the paralogue annotation method
Most hTRPM4 variants in our dataset are currently classified as VUS. ClinPred predicted 299 VUSs with the pathogenicity score > 0.6 as LP/P variants. Among these, we further selected those variants that are annotated as LP/P in at least one paralogue of hTRPM4 (conservation score across paralogues Cs>0.3). Both methods consensually predicted 12 of 565 VUSs as LP/P variants (Table 4 and Fig 1). The variants are localized mainly in the transmembrane region of the channel (Fig 1). Five variants are located in transmembrane helices S4 and S5, and one variant (V966I) is located in the P-loop between helices S5 and S6. Fig 3 indicates parental residues of the 12 variants in the cryo-EM structure of hTRPM4 [18]. In the homotetrameric channel, each parental residue is shown four times.
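The consensus selection combines three conditions: a ClinPred score above 0.6, an LP/P annotation in at least one paralogue at the matching position, and Cs > 0.3 at that position. A minimal sketch of this filter is given below; the record layout and all example variants (deliberately named with placeholder residues) are hypothetical and are not data from this study.

```python
CLINPRED_THRESHOLD = 0.6  # pathogenicity cutoff stated in the text
CS_THRESHOLD = 0.3        # conservation cutoff stated in the text


def consensus_lpp(vus_list, annotated_positions, cs_by_position):
    """Select VUSs supported by both ClinPred and paralogue annotation."""
    selected = []
    for v in vus_list:
        pos = v["position"]
        if (v["clinpred"] > CLINPRED_THRESHOLD
                and pos in annotated_positions
                and cs_by_position.get(pos, 0.0) > CS_THRESHOLD):
            selected.append(v["variant"])
    return selected


# Hypothetical inputs, for illustration only
vus = [
    {"variant": "X10Y", "position": 10, "clinpred": 0.90},  # passes all three
    {"variant": "X20Y", "position": 20, "clinpred": 0.40},  # low ClinPred score
    {"variant": "X30Y", "position": 30, "clinpred": 0.80},  # no paralogue annotation
    {"variant": "X40Y", "position": 40, "clinpred": 0.95},  # poorly conserved position
]
paralogue_lpp_positions = {10, 20, 40}
cs = {10: 0.7, 20: 0.9, 30: 0.9, 40: 0.2}

consensus = consensus_lpp(vus, paralogue_lpp_positions, cs)
```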

Intersegment contacts between parental residues of ClinPred predicted LP/P variants
We used the cryo-EM structure of the hTRPM4 channel to find intersegment contacts between the parental (WT) residues for which ClinPred predicted a high probability of pathogenicity of the respective VUSs (S3 Table). We reasoned that if the sidechains of two such residues have heavy atoms within 5 Å of each other, then mutation of either contact partner would affect the intersegment interaction and the relative stability of the channel state, and thus may underlie the channel dysfunction.
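The distance criterion can be illustrated with a short Python sketch. It assumes that atoms are available as simple records with an atom name, an element symbol and Cartesian coordinates in Å; this record layout and the coordinates are hypothetical, and parsing of the PDB entry is not shown. Backbone atoms (N, CA, C, O) and hydrogens are excluded so that only sidechain heavy atoms are compared.

```python
import math

BACKBONE_ATOMS = {"N", "CA", "C", "O"}
CUTOFF = 5.0  # distance criterion in angstroms, as stated in the text


def sidechain_heavy_atoms(residue):
    """Return non-hydrogen atoms that do not belong to the backbone."""
    return [
        a for a in residue
        if a["name"] not in BACKBONE_ATOMS and a["element"] != "H"
    ]


def in_contact(res_a, res_b, cutoff=CUTOFF):
    """True if any pair of sidechain heavy atoms lies within the cutoff."""
    return any(
        math.dist(a["xyz"], b["xyz"]) <= cutoff
        for a in sidechain_heavy_atoms(res_a)
        for b in sidechain_heavy_atoms(res_b)
    )


# Hypothetical coordinates, for illustration only
res_1 = [
    {"name": "CA", "element": "C", "xyz": (0.0, 0.0, 0.0)},
    {"name": "CB", "element": "C", "xyz": (1.5, 0.0, 0.0)},
]
res_2 = [
    {"name": "CA", "element": "C", "xyz": (9.0, 0.0, 0.0)},
    {"name": "CG", "element": "C", "xyz": (5.5, 0.0, 0.0)},
]
res_3 = [{"name": "CB", "element": "C", "xyz": (20.0, 0.0, 0.0)}]
backbone_only = [{"name": "CA", "element": "C", "xyz": (2.0, 0.0, 0.0)}]

close_pair = in_contact(res_1, res_2)   # CB and CG are 4.0 angstroms apart
far_pair = in_contact(res_1, res_3)
```

A residue without sidechain heavy atoms, such as glycine, never forms a contact under this definition.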
In the transmembrane and extracellular parts of the channel, we found 16 contacts between the parental residues and propose that respective 25 VUSs have a high probability to be LP/P variants (Table 5 and Fig 4).Most of the found contacts are within the VSM, which undergoes significant transformation in the cryo-EM structures of P-loop channels with activated and deactivated VSMs [29,30].The cryo-EM structure of hTRPM4 shows activated VSMs.Although the voltage dependency of the channel is rather weak [31], such contacts will change upon the VSM deactivation.Mutations in such contacts (annotated "VSM activation" in Table 5) would affect VSM activation/deactivation. Contacts between the sliding helix S4 and the outer helix S5 may mediate signal transmission from VSM to the pore module [32].Mutations of such contacts may cause the channel dysfunction by affecting the signal transmission.Respective contacts are annotated "VSM i -PM i+1 signal" in Table 5.
We also found intersegment contacts whose state-dependency is unclear (between P-loop helices P1 and P2) or unlikely (between helices S5 and P1). In practically all P-loop channels, the C-terminal part of the outer helices (S5) has a small residue (G, A, or S), which is involved in a knob-into-hole contact with a bulky residue at the N-end of the P-helix of the same subunit/repeat [33]. Mutations in such contacts, which would affect folding of the respective subunit/repeat in the pore module, are annotated "PM i folding" in Table 5. Four intra-subunit contacts between P-loop helices P1 and P2 are involved in stabilization of the P-loop folding (annotated "P-loop folding" in Table 5). In the Nav1.5 channel, disease mutations in such contacts affect slow inactivation [34]. Structural determinants of TRPM4 inactivation are unclear, but in the TRPM2 channel the inactivation gate is located in the extracellular selectivity filter [35].
We also found four inter-subunit contacts at the extracellular part of the channel between residues in the VSM helix S1 and PM helix S5.The state-dependency of such contacts is unclear, but respective interactions likely contribute to stabilizing mutual disposition of VSMs and PM.These contacts are annotated "VSM i -PM i+1 Clamp" in Table 5.
We further found 24 intersegment contacts in the cytoplasmic parts of the channel between parental residues of VUSs with a high ClinPred score of pathogenicity. These contacts involve 43 residues (Table 6 and Fig 5). In the absence of cryo-EM structures with different states of the cytoplasmic parts, which are unique in TRP channels, the state-dependency of such contacts is unclear. Nevertheless, the fact that some experimentally confirmed pathogenic mutations are located in

Notes to Table 6: a All contacts are within a single subunit; no intersubunit contacts were found. b GoF [53]. c Gain-of-expression and gain-of-function [52]. https://doi.org/10.1371/journal.pone.0295974.t006

Discussion
Over 90% of variants in the hTRPM4 channel are classified as VUSs. The large proportion of VUSs is a serious problem because the respective variants are not clinically actionable, and incorrect interpretation of the pathogenicity of a variant has serious ramifications for assessing the probability of sudden cardiac death [36][37][38]. The ACMG/AMP guidelines suggest using in silico prediction tools for variant interpretation. Due to their low accuracy, computational approaches are still considered as rather weak supporting evidence compared to functional methods. Some of the early popular variant interpretation tools like SIFT are relatively simple and rely on sequence homology and physico-chemical properties of amino acids [39]. Other earlier tools use one or more structure-based characteristics such as effects of mutations on the stability, folding and dynamics of proteins, e.g. I-Mutant 2.0 [40], FoldX [41], and CUPSAT [42], or combine sequence and structure characteristics, e.g. PolyPhen-2 [43]. Most of these tools use machine learning methods trained with numerous biochemical features and evolutionary constraints or pathogenic classification data that can be collected from structure-based databases, e.g. the thermodynamic database for proteins and mutants [44], and variant pathogenicity databases such as SWISS-PROT [20] and ClinVar [7]. Recently developed tools like REVEL, MetaSVM, and MCap combine multiple scores from different tools into a single ensemble score with improved prediction capabilities [45]. ClinPred, besides a wide range of existing approaches, uses allele frequency from gnomAD as one of the key predictive features [46]. Our recent analysis revealed that various popular bioinformatics tools yield different predictions of pathogenicity for known LP/P and benign variants of the hCav1.2 channel and its paralogues [11]. We have shown that for each protein family it is necessary to select a specific best-performing predictor from a variety of existing methods. This is possible if databases describe a rather large number of pathogenic and neutral variants for the given family of proteins. In the present study, we have shown that for the TRP superfamily, ClinPred is the most accurate method with accuracy of 0.83 and AUC of 0.90.
We further used the paralogue annotation method, which employs multiple sequence alignment and data on missense variants associated with diseases across the protein family.This approach was developed and experimentally validated with a large set of known variants in eight genes associated with the long QT syndrome, and demonstrated positive predictive value of 98.4% in these genes [47].The accuracy of the method depends on the quality of the protein sequence alignment, the conservation score Cs in each sequence position among paralogues, and the quality of the evidence relating genotype to phenotype for the paralogue variant [13].The lack of paralogue annotation for an uncharacterized variant in the protein in question does not mean the variant is non-pathogenic.In other words, the paralogue annotation method is based on clinical genotype-phenotype relationships in humans, rather than on computational prediction.However, when the paralogue annotation method is applied along with a bioinformatics tool, the consensus predictions of pathogenicity become much more reliable than predictions from individual methods.
The hTRPM4 channel belongs to the transient receptor potential (TRP) superfamily. The TRPM family, which has eight members, is the largest and most diverse subfamily of the TRP channels [48]. Since very few LP/P variants of the TRPM channels were found in public databases, we also considered data on variants from other TRP family members (Table 1). However, only the TRPA1, TRPC6, TRPV4, and TRPV6 channels, for which one or more LP/P variants are described, were included in our broad dataset (Table 2, S1 Table).
All hTRPM4 paralogues are homotetramers with conserved TMD and TRP domains.TRPM channels also have a MHR domain in the N-terminal region, which distinguishes them from other TRP channels where the N-terminal cytosolic domains contain mainly ankyrin repeats.Therefore, TRPM4 residues in the MHR region have low conservation scores (S2 Table ).The MHR domain is subdivided into four melastatin homology regions (MHR1-MHR4) based on sequence similarity within the TRPM family [49].Most of known LP/P variants are localized in MHR1/2 domain (Fig 1) and are associated with the PFHBI-IB disease [48].Domain MHR1/2 consists of a β-sheet core surrounded by α helices and loops (Fig 1).It strongly interacts with domain MHR3 from the same subunit, and has rather weak contacts with domain MHR3 in the adjacent subunit.The interface between domains MHR1/ 2 and MHR3 forms a binding pocket for decavanadate, a negatively charged metal cluster that shifts the voltage dependent activation of the hTRPM4 channel towards negative potentials [18,50].It was proposed that LP/P variants in domain MHR1/2 may disturb sensing of external stimuli by the channel [18].
Using ClinPred and paralogue annotations, we predicted that 12 VUSs of hTRPM4 are LP/P variants. All these variants are located in the MHR and TMD regions. Two variants are at residue positions 437 and 445 in the α14 helix of MHR3 (Fig 1). In the cryo-EM structure of hTRPM4 [18], helices α14 and α13 form an interface, and variants R437W and I445M may affect the movement of the helices and destabilize the MHR3 domain [51]. The stacked α-helices of MHR3-4 form interfaces with the C-terminal TRP domain, thus linking the N- and C-termini and additionally providing a direct interaction between the cytosolic regions of the channel and the transmembrane core. Variants R664H and R664L are located in domain MHR4 that interacts with the TRP domain and the S2-S3 linker helix of TMD on the top, and with the rib helix of CTD and MHR3 on the bottom. Mutated residues N913S and G917R in the S4-S5 linker form hydrogen bonds with residues in the TRP domain [18]. Variants R664H, R664L, N913S, and G917R, which we characterized as LP/P, may affect interactions between MHR, CTD, and TMD. Three variants, F936I, F936L and V966L, were found in the S5 segment and P-loop region, which belongs to the pore domain. These variants may influence gating of TRPM4 and calcium permeability.
Among 10 pathogenic variants reported in public databases, only two missense variants are located in the TM region: K914R and I1033M. The side chain of K914 is unresolved in the cryo-EM structure. I1033 makes multiple intersubunit contacts with hydrophobic residues (F933 in P1, F975 in S5, L1039 and L1043 in S6), but none of these residues is reported in public databases or has a high score by ClinPred, suggesting that I1033M may affect channel expression rather than the channel function.
Among six VUSs located in the transmembrane region that ClinPred and the paralogue annotation method consensually predicted as LP/P variants (I909V, N913S, G917R, F936I, F936L and V966I), only parental residue I909 forms an intersegment contact with a residue whose pathogenic variant Y790C is reported in public databases. Y790 forms an intersegment contact with R905, the only basic residue in the voltage-sensing sliding helix S4 (Table 5). A recent study identified TRPM4 variant Y790C in patients suffering from complete heart block and their relatives and demonstrated gain-of-expression and gain-of-function of mutant channel Y790C [52]. The same study also confirmed the pathogenic status of variant L1113V and demonstrated gain-of-expression and gain-of-function of mutant channel L1113V. These findings exemplify the predictive potential of our approaches.

Conclusions
In this study we compiled and analyzed a broad dataset that includes all currently known pathogenic and likely pathogenic variants in the hTRPM4 channel and its seven paralogues. We found that ClinPred is the best-performing bioinformatics tool to predict likely pathogenic/pathogenic (LP/P) variants for TRP channels. ClinPred and the paralogue annotation method consensually predicted 12 hTRPM4 VUSs as potentially pathogenic variants. We further used a cryo-EM structure of the hTRPM4 channel to analyze intersegment contacts of 307 parental (wild-type) residues, which have a high ClinPred score of pathogenicity. We found scores of contact pairs between the WT residues whose mutations may affect the protein structure. Taken together, our approaches predicted that a total of 80 VUSs are likely damaging variants. This number is much larger than the 10 pathogenic variants currently reported in Humsavar, ClinVar and Ensembl Variation. The 80 variants are promising targets for further experimental and theoretical studies.

Fig 2. ROC curves for prediction algorithms on the broad dataset. This plot illustrates performance of quantitative predictions. The higher AUC score indicates the better performance. https://doi.org/10.1371/journal.pone.0295974.g002

Fig 3. Channel hTRPM4. (A) Topology of transmembrane helices. VUSs with high ClinPred and paralogue annotation scores are indicated by their numbers in Table 4. The number of variants in the full-fledged channel is four times larger than that in a subunit. (B) Side and intracellular views of the cryo-EM structure [18]. Subunits are colored as top bars in (A). Positions of hTRPM4 VUSs are shown as spheres. https://doi.org/10.1371/journal.pone.0295974.g003

Fig 5. Cytoplasmic part in the cryo-EM structure of hTRPM4. Shown are intersegment contacts between parental residues of ClinVar-reported VUSs that according to ClinPred have a high probability to be LP/P variants. See Table 6 for the list of contacts.

Table 2. Known variants in the hTRPM4 channel and its paralogues.
Gene a   UniProt ID b   LP/P c   VUS d   Benign e
a Genes of the TRP superfamily that contain one or more LP/P variants in public databases
b Protein index in pathogenic variants
c LP/P is Likely Pathogenic/Pathogenic variant
d Variants of unknown clinical significance
e Variants from the gnomAD database that occur in a population with allele frequency >0.00005 and are absent in the ClinVar database
https://doi.org/10.1371/journal.pone.0295974.t002

Table 5. Intersegment contacts in the cryo-EM structure between WT residues in the transmembrane and P-loop regions with high ClinPred score a.
a The following variant conditions are reported in ClinVar: Progressive familial heart block type IB (15 variants), Cardiovascular Phenotype (7 variants), or Not Provided (2 variants)
b Each contact is indicated twice to show both contact partners
c Gain-of-expression and gain-of-function [52]
https://doi.org/10.1371/journal.pone.0295974.t005

Fig 4. Subunit interface in the transmembrane and extracellular regions in the cryo-EM structure of hTRPM4. Shown are intersegment contacts between parental (WT) residues of ClinVar-listed VUSs that according to ClinPred have a high probability to be damaging variants. See Table 5 for the list of contacts. https://doi.org/10.1371/journal.pone.0295974.g004

Table 6
[18]on (Fig 1) suggests functional importance of such contacts. For example, LP/P variants in domain MHR1/2 may disturb the channel sensing of external stimuli [18].